Efficient Information Extraction Based on Signature Index

نویسندگان

  • Minghui Wu
  • Zemin Liu
  • Shiwen Cheng
چکیده

Information Extraction (IE) is an essential tool to retrieve structured information from at text (including web-content).Given a pattern query, an ideal IE application should be able to extract the matched target effectively and efficiently. However, as far as we know, efficiency and flexibility are major concern for typical IE tools since they either use brute-force document parsing for each query off-line or support on-line query in pre-extracted elements. In order to promote accuracy and efficiency of extraction, in this paper we propose a novel framework iExtractor that leverages In-formation Retrieval (IR) indexes to speed-up IE processes. We index text blocks with their signatures (presented as bit-strings) and propose efficient IE algorithms based on the signature index. Hence, iExtractor can validate query pattern in signature index without original text. The framework also supports on-line extraction through a general and flexible pattern extraction language. Our extensive experimental results on diverse real datasets show that our approach delivers stable efficiency and has outperforms baselines in terms of extraction accuracy.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

An efficient blind signature scheme based on the elliptic curve discrete logarithm problem

Elliptic Curve Cryptosystems (ECC) have recently received significant attention by researchers due to their high performance such as low computational cost and small key size. In this paper a novel untraceable blind signature scheme is presented. Since the security of proposed method is based on difficulty of solving discrete logarithm over an elliptic curve, performance of the proposed scheme ...

متن کامل

An ECC-Based Mutual Authentication Scheme with One Time Signature (OTS) in Advanced Metering Infrastructure

Advanced metering infrastructure (AMI) is a key part of the smart grid; thus, one of the most important concerns is to offer a secure mutual authentication.  This study focuses on communication between a smart meter and a server on the utility side. Hence, a mutual authentication mechanism in AMI is presented based on the elliptic curve cryptography (ECC) and one time signature (OTS) consists o...

متن کامل

Convertible limited (multi-) verifier signature: new constructions and applications

A convertible limited (multi-) verifier signature (CL(M)VS) provides controlled verifiability and preserves the privacy of the signer. Furthermore, limited verifier(s) can designate the signature to a third party or convert it into a publicly verifiable signature upon necessity. In this proposal, we first present a generic construction of convertible limited verifier signature (CLVS) into which...

متن کامل

The new protocol blind digital signature based on the discrete logarithm problem on elliptic curve

In recent years it has been trying that with regard to the question of computational complexity of discrete logarithm more strength and less in the elliptic curve than other hard issues, applications such as elliptic curve cryptography, a blind  digital signature method, other methods such as encryption replacement DLP. In this paper, a new blind digital signature scheme based on elliptic curve...

متن کامل

Face Recognition Based Rank Reduction SVD Approach

Standard face recognition algorithms that use standard feature extraction techniques always suffer from image performance degradation. Recently, singular value decomposition and low-rank matrix are applied in many applications,including pattern recognition and feature extraction. The main objective of this research is to design an efficient face recognition approach by combining many tech...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2015